    Adapting the Neural Encoder-Decoder Framework from Single to Multi-Document Summarization

    Generating a text abstract from a set of documents remains a challenging task. The neural encoder-decoder framework has recently been exploited to summarize single documents, but its success can in part be attributed to the availability of large parallel data automatically acquired from the Web. In contrast, parallel data for multi-document summarization are scarce and costly to obtain. There is a pressing need to adapt an encoder-decoder model trained on single-document summarization data to work with multiple-document input. In this paper, we present an initial investigation into a novel adaptation method. It exploits the maximal marginal relevance method to select representative sentences from the multi-document input, and leverages an abstractive encoder-decoder model to fuse disparate sentences into an abstractive summary. The adaptation method is robust and itself requires no training data. Our system compares favorably to state-of-the-art extractive and abstractive approaches as judged by automatic metrics and human assessors.
    Comment: 11 pages
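
    A minimal sketch of the maximal marginal relevance (MMR) selection step named above, greedily balancing relevance against redundancy. The TF-IDF similarity, centroid query, and trade-off value lam are illustrative assumptions, not the authors' exact configuration.

        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        def mmr_select(sentences, k=3, lam=0.7):
            """Greedily pick k sentences: relevant to the centroid, novel w.r.t. prior picks."""
            vecs = TfidfVectorizer().fit_transform(sentences)
            centroid = np.asarray(vecs.mean(axis=0))           # document-set "query"
            relevance = cosine_similarity(vecs, centroid).ravel()
            pairwise = cosine_similarity(vecs)                 # sentence-sentence similarity
            selected, candidates = [], list(range(len(sentences)))
            while candidates and len(selected) < k:
                def mmr(i):
                    redundancy = max(pairwise[i][j] for j in selected) if selected else 0.0
                    return lam * relevance[i] - (1 - lam) * redundancy
                best = max(candidates, key=mmr)
                selected.append(best)
                candidates.remove(best)
            return [sentences[i] for i in selected]

    The selected sentences would then be passed to the single-document encoder-decoder model for fusion.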

    Structure-Infused Copy Mechanisms for Abstractive Summarization

    Seq2seq learning has produced promising results on summarization. In many cases, however, system summaries still struggle to keep the meaning of the original intact: they may omit important words or relations that play critical roles in the syntactic structure of source sentences. In this paper, we present structure-infused copy mechanisms to facilitate copying important words and relations from the source sentence to the summary sentence. The approach naturally combines source dependency structure with the copy mechanism of an abstractive sentence summarizer. Experimental results demonstrate the effectiveness of incorporating source-side syntactic information in the system, and our proposed approach compares favorably to state-of-the-art methods.
    Comment: 13 pages
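
    To make the idea concrete, here is a toy sketch of a copy distribution whose attention weights are re-weighted by structural salience scores (e.g., derived from dependency relations). The salience re-weighting and the mixing weight p_gen are illustrative simplifications; the paper's actual models couple the dependency structure into the encoder and attention more elaborately.

        import numpy as np

        def copy_distribution(p_vocab, attention, salience, src_ids, p_gen):
            """Mix generation and copying, biasing attention by structural salience."""
            biased = attention * salience        # up-weight structurally important source tokens
            biased = biased / biased.sum()       # renormalize to a distribution
            p_copy = np.zeros_like(p_vocab)
            for a, tok in zip(biased, src_ids):
                p_copy[tok] += a                 # scatter attention mass onto source token ids
            return p_gen * p_vocab + (1.0 - p_gen) * p_copy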

    Controlling the Amount of Verbatim Copying in Abstractive Summarization

    An abstract must not change the meaning of the original text. One of the most effective ways to achieve that is to increase the amount of copying while still allowing for text abstraction. Human editors can usually exercise control over copying, resulting in summaries that are more extractive than abstractive, or vice versa. However, it remains poorly understood whether modern neural abstractive summarizers can provide the same flexibility, i.e., learning from single reference summaries to generate multiple summary hypotheses with varying degrees of copying. In this paper, we present a neural summarization model that, by learning from single human abstracts, can produce a broad spectrum of summaries ranging from purely extractive to highly generative ones. We frame the task of summarization as language modeling and exploit alternative mechanisms to generate summary hypotheses. Our method allows for control over copying during both the training and decoding stages of a neural summarization model. Through extensive experiments we illustrate the significance of our proposed method for controlling the amount of verbatim copying and achieve competitive results over strong baselines. Our analysis further reveals interesting, non-obvious findings.
    Comment: AAAI 2020 (Main Technical Track)
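
    One simple decoding-time mechanism in this spirit, sketched below, adds a bonus to the logits of tokens appearing in the source, scaled by a copy-control coefficient. This is an illustrative stand-in, not the paper's exact formulation (which also controls copying during training).

        import numpy as np

        def copy_controlled_logits(logits, source_token_ids, copy_weight):
            """Raise (copy_weight > 0) or lower (copy_weight < 0) source-token scores."""
            adjusted = logits.copy()
            adjusted[list(set(source_token_ids))] += copy_weight
            return adjusted

        def softmax(x):
            z = np.exp(x - x.max())
            return z / z.sum()

        # Usage: a higher copy_weight yields more extractive next-token choices.
        logits = np.random.randn(50)          # toy vocabulary of 50 tokens
        source = [3, 7, 7, 12]                # token ids present in the source
        probs_extractive = softmax(copy_controlled_logits(logits, source, 2.0))
        probs_abstractive = softmax(copy_controlled_logits(logits, source, -2.0))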

    PIVOINE: Instruction Tuning for Open-world Information Extraction

    We consider the problem of Open-world Information Extraction (Open-world IE), which extracts comprehensive entity profiles from unstructured texts. Unlike the conventional closed-world setting of Information Extraction (IE), Open-world IE considers a more general situation where entities and relations may fall outside a predefined ontology. More importantly, we seek to develop a large language model (LLM) that is able to perform Open-world IE, extracting desirable entity profiles characterized by (possibly fine-grained) natural language instructions. We achieve this by finetuning LLMs using instruction tuning. In particular, we construct INSTRUCTOPENWIKI, a substantial instruction tuning dataset for Open-world IE enriched with a comprehensive corpus, extensive annotations, and diverse instructions. We finetune the pretrained BLOOM models on INSTRUCTOPENWIKI and obtain PIVOINE, an LLM for Open-world IE with strong instruction-following capabilities. Our experiments demonstrate that PIVOINE significantly outperforms traditional closed-world methods and other LLM baselines, displaying impressive generalization capabilities on both unseen instructions and out-of-ontology cases. Consequently, PIVOINE emerges as a promising solution for tackling the open-world challenge in IE effectively.
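
    A hypothetical example of what an instruction-tuning record for open-world IE might look like. The field names and instruction wording are invented for illustration; consult the INSTRUCTOPENWIKI release for the actual schema.

        # One illustrative training record: an instruction, an input text, and
        # the target entity profiles the model should produce.
        example = {
            "instruction": "Extract an entity profile, including its type and "
                           "relations, for every entity mentioned in the text.",
            "input": "Marie Curie won the Nobel Prize in Physics in 1903.",
            "output": {
                "entities": [
                    {"mention": "Marie Curie", "type": "person",
                     "relations": [{"relation": "award received",
                                    "object": "Nobel Prize in Physics"}]},
                    {"mention": "Nobel Prize in Physics", "type": "award"},
                ]
            },
        }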

    DecipherPref: Analyzing Influential Factors in Human Preference Judgments via GPT-4

    Human preference judgments are pivotal in guiding large language models (LLMs) to produce outputs that align with human values. Human evaluations are also used in summarization tasks to compare outputs from various systems, complementing existing automatic metrics. Despite their significance, however, there has been limited research probing these pairwise or k-wise comparisons. The collective impact and relative importance of factors such as output length, informativeness, fluency, and factual consistency are still not well understood. It is also unclear whether there are other hidden factors influencing human judgments. In this paper, we conduct an in-depth examination of a collection of pairwise human judgments released by OpenAI. Utilizing the Bradley-Terry-Luce (BTL) model, we reveal the inherent preferences embedded in these human judgments. We find that the most favored factors vary across tasks and genres, whereas the least favored factors tend to be consistent, e.g., outputs that are too brief, contain excessive off-focus content, or include hallucinated facts. Our findings have implications for the construction of balanced datasets in human preference evaluations, which is a crucial step in shaping the behaviors of future LLMs.
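
    In the BTL model, the probability that item i is preferred over item j is p_i / (p_i + p_j), where p_i is a latent strength. Below is a compact sketch of fitting these strengths with the classic minorization-maximization update (Hunter, 2004); the toy win matrix is invented for illustration.

        import numpy as np

        def fit_btl(wins, iters=200):
            """wins[i][j] = number of times item i was preferred over item j."""
            n = wins.shape[0]
            p = np.ones(n)
            comparisons = wins + wins.T                  # total comparisons per pair
            for _ in range(iters):
                for i in range(n):
                    denom = sum(comparisons[i, j] / (p[i] + p[j])
                                for j in range(n) if j != i)
                    p[i] = wins[i].sum() / denom         # MM update for item i
                p /= p.sum()  # fix the scale; BTL is identifiable only up to a constant
            return p

        wins = np.array([[0, 7, 9],
                         [3, 0, 6],
                         [1, 4, 0]])
        print(fit_btl(wins))  # larger values correspond to more-preferred items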

    NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization

    Narrative summarization aims to produce a distilled version of a narrative that describes its most salient events and characters. Summarizing a narrative is challenging because it requires an understanding of event causality and character behaviors. To encourage research in this direction, we propose NarraSum, a large-scale narrative summarization dataset. It contains 122K narrative documents, collected from plot descriptions of movies and TV episodes across diverse genres, together with their corresponding abstractive summaries. Experiments show a large performance gap between humans and state-of-the-art summarization models on NarraSum. We hope this dataset will promote future research in summarization, as well as broader studies of natural language understanding and generation. The dataset is available at https://github.com/zhaochaocs/narrasum.
    Comment: EMNLP Findings 2022

    Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models

    We consider the problem of eliciting compositional generalization capabilities in large language models (LLMs) with a novel type of prompting strategy. Compositional generalization empowers LLMs to solve problems that are harder than the ones they have seen (i.e., easy-to-hard generalization), a critical reasoning capability of human-like intelligence. However, even the current state-of-the-art LLMs still struggle with this form of reasoning. To bridge this gap, we propose skills-in-context (SKiC) prompting, which instructs LLMs how to compose basic skills to resolve more complex problems. We find that it is crucial to demonstrate both the skills and the compositional examples within the same prompting context. With as few as two exemplars, SKiC prompting initiates strong synergies between skills and their composition capabilities. Notably, it empowers LLMs to solve unseen problems that require innovative skill compositions, achieving near-perfect generalization on a broad range of challenging compositionality tasks. Intriguingly, SKiC prompting unlocks the latent potential of LLMs, enabling them to leverage pre-existing internal skills acquired during earlier pretraining stages, even when these skills are not explicitly presented in the prompting context. As a result, LLMs can solve unseen complex problems by activating and composing internal competencies. With these properties, SKiC prompting achieves state-of-the-art performance on challenging mathematical reasoning benchmarks (e.g., MATH).
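
    A hypothetical SKiC-style prompt, illustrating the key ingredient above: basic skills and a worked compositional example presented in the same context, followed by the new problem. The skill descriptions and example are invented for illustration and are not taken from the paper's prompts.

        # Skills and a composed example share one context; the model is then
        # asked to compose the same skills on an unseen instance.
        skic_prompt = """\
        Skill 1 (last letter): the last letter of "apple" is "e".
        Skill 2 (concatenate): joining "e" and "n" gives "en".

        Composed example:
        Q: Take the last letters of "apple" and "melon" and concatenate them.
        A: Last letter of "apple" is "e" (Skill 1). Last letter of "melon" is
        "n" (Skill 1). Concatenating gives "en" (Skill 2). Answer: en.

        Q: Take the last letters of "river" and "stone" and concatenate them.
        A:"""
        # The prompt string would then be sent to an LLM completion endpoint.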